Main text tables and figures.
Table 1. Comparison of Sparkling Lake, Lake Mendota, and Trout Bog. These three lakes were chosen for comparative metatranscriptomics because of their varying trophic statuses, extensive historical data, and previous microbial sampling. Data on surface area, maximum depth, and shoreline development are courtesy of NTL-LTER <lter.limnology.wisc.edu>. Temperature, dissolved oxygen, pH, and conductivity were measured using a HydroLab DS5x Sonde and are averaged over all sampling depths and timepoints for each lake. Chlorophyll and phycocyanin concentrations were measured from the integrated epilimnion samples using a methanol extraction protocol and averaged over all timepoints. Secchi depth was measured at the first timepoint for each lake. Bacterial production was quantified via C14-leucine incorporation and averaged over all timepoints. Due to thunderstorms on the night of July 8th, the final 1AM timepoint in Sparkling Lake was collected on July 9th instead.
| | Lake Mendota | Trout Bog | Sparkling Lake |
|---|---|---|---|
| Surface area (km^2) | 39.600 | 0.001 | 0.637 |
| Maximum depth (m) | 25.3 | 7.9 | 20.0 |
| Trophic status | Eutrophic | Humic | Oligotrophic |
| Location | Madison, WI USA | Boulder Junction, WI USA | Boulder Junction, WI USA |
| GPS Coordinates | 43.1113, -89.4255 | 46.0412, -89.6861 | 46.0091, -89.6695 |
| Shoreline development | High | Low | Moderate |
| Epilimnion sampling depth | 0-7m | 0-1.5m | 0-4m |
| Temperature (C) | 24.61 | 19.51 | 23.33 |
| Dissolved oxygen (mg/L) | 9.71 | 5.33 | 9.25 |
| pH | 8.64 | 4.03 | 7.52 |
| Conductivity (uS/cm) | 608 | 25.87 | 174.54 |
| Total phosphorus (ug/L) | 18.81 | 23.17 | 6.44 |
| Total nitrogen (ug/L) | 625.83 | 667.98 | 346.46 |
| Total dissolved phosphorus (ug/L) | 8.83 | 15.41 | 5.42 |
| Total dissolved nitrogen (ug/L) | 506.7 | 587.51 | 305.17 |
| Chlorophyll (ug/L) | 6.14 | 14.44 | 1.77 |
| Phycocyanin (ug/L) | 0.74 | 1.94 | 3.15 |
| Bacterial production (cpm) | 60.02 | 30.3 | 3.15 |
| Secchi depth (m) | 4.8 | 1.1 | 6.2 |
| Sampling dates (2016) | July 14-16 | July 8-10 | July 6-9 |
| Sunrise/sunset time | 5:32/20:35 | 5:18/20:49 | 5:17/20:50 |
Figure 1. Cyclic trends in Lake Mendota. Cyclic trends with a 12-hour phase were detected in the top 20,000 most expressed genes in each lake using RAIN. Here, we present an example of these cyclic trends in genes related to photosynthesis in Lake Mendota. The percentage of genes in each functional category with significant cyclic trends is reported in Tables 2-4. Read counts were z-score normalized for the purpose of visualization. The phase indicates the time of peak expression within the cycle.
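The z-score normalization mentioned in the Figure 1 caption can be illustrated with a minimal sketch. This is not the code used for the figure; the function name, the example counts, and the choice of the sample standard deviation (n - 1 denominator) are assumptions made for illustration.

```python
from statistics import mean, stdev


def zscore(counts):
    """Z-score normalize one gene's read counts across timepoints.

    Assumption: the sample standard deviation (n - 1) is used; the
    caption does not state which denominator was applied.
    """
    mu = mean(counts)
    sd = stdev(counts)
    if sd == 0:
        # A gene with constant expression has no variation to scale.
        return [0.0 for _ in counts]
    return [(c - mu) / sd for c in counts]


# Hypothetical read counts for one gene at six timepoints.
example_counts = [10, 40, 90, 40, 10, 5]
z = zscore(example_counts)
```

After normalization each gene has mean 0 and unit standard deviation, so genes with very different absolute read counts can be plotted on a shared axis.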
Table 2. Gene expression in day vs. night by functional categories in Lake Mendota. We aggregated timepoints by day (9AM, 1PM, 5PM) and night (9PM, 1AM, 5AM) to compare differential gene expression. This analysis includes the top 20,000 most expressed genes. Functional categories were determined based on gene annotations. Genes with cyclic trends were detected using RAIN, while p-values of day vs. night read totals per sample were calculated using a two-tailed t-test. The day/night ratio is the sum of reads assigned to that category in day divided by the sum at night. A ratio greater than one indicates higher expression in day, while a ratio less than one indicates higher expression at night.
| Functional category | Number of genes | % Genes more expressed in day | % Genes more expressed at night | % Cyclic genes (12 hr phase) | p-value from t-test of day vs. night read totals | Day/night ratio |
|---|---|---|---|---|---|---|
| Photosynthesis | 637 | 46.47 | 30.61 | 21.51 | 0 | 2.57 |
| Rhodopsins | 124 | 29.03 | 22.58 | 13.71 | 0.04 | 1.37 |
| RuBisCO | 63 | 23.81 | 4.76 | 0 | 0.67 | 1.15 |
| reductive TCA | 14 | 7.14 | 21.43 | 0 | 0.08 | 1.23 |
| Polyamines | 51 | 0 | 43.14 | 0 | 0.41 | 0.9 |
| Reactive oxygen species | 63 | 39.68 | 3.17 | 15.87 | 0 | 1.76 |
| Protease | 252 | 20.63 | 5.16 | 7.54 | 0.02 | 1.22 |
| Ribose transport | 28 | 0 | 46.43 | 0 | 0.02 | 0.75 |
| General sugar transport | 237 | 44.3 | 33.76 | 5.06 | 0.02 | 1.39 |
| Raffinose/stachyose/melibiose transport | 25 | 0 | 68 | 0 | 0 | 0.58 |
| Glucose/mannose transport | 36 | 0 | 16.67 | 0 | 0.48 | 0.9 |
| Rhamnose transport | 11 | 0 | 54.55 | 0 | 0.16 | 0.84 |
| Xylose transport | 45 | 2.22 | 6.67 | 0 | 0.83 | 0.96 |
| Amino acid transport | 258 | 8.91 | 12.02 | 1.94 | 0.4 | 1.08 |
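The day/night ratio and the day vs. night comparison described in the Table 2 caption can be sketched as follows. This is an illustrative sketch only: the function name and sample values are hypothetical, and the use of Welch's unequal-variance form of the t statistic is an assumption, since the caption specifies only a two-tailed t-test.

```python
from math import sqrt
from statistics import mean, variance

# Timepoints (24-hour clock) grouped as in the caption.
DAY_HOURS = {9, 13, 17}
NIGHT_HOURS = {21, 1, 5}


def day_night_summary(samples):
    """Summarize one functional category.

    samples: list of (hour, read_total) pairs, one per sample, where
    read_total is the sum of reads assigned to the category.
    Returns (day/night ratio, t statistic).
    """
    day = [reads for hour, reads in samples if hour in DAY_HOURS]
    night = [reads for hour, reads in samples if hour in NIGHT_HOURS]

    # Ratio of summed reads: > 1 means higher expression in day.
    ratio = sum(day) / sum(night)

    # Welch's t statistic (assumed variant) on per-sample read totals.
    se = sqrt(variance(day) / len(day) + variance(night) / len(night))
    t_stat = (mean(day) - mean(night)) / se
    return ratio, t_stat


# Hypothetical per-sample read totals for one category.
samples = [(9, 120), (13, 150), (17, 130), (21, 60), (1, 50), (5, 70)]
ratio, t_stat = day_night_summary(samples)
```

Converting the t statistic to a two-tailed p-value requires the t distribution, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would normally be used for that step.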
Table 3. Gene expression in day vs. night by functional categories in Trout Bog. We aggregated timepoints by day (9AM, 1PM, 5PM) and night (9PM, 1AM, 5AM) to compare differential gene expression. This analysis includes the top 20,000 most expressed genes. Functional categories were determined based on gene annotations. Genes with cyclic trends were detected using RAIN, while p-values of day vs. night read totals per sample were calculated using a two-tailed t-test. The day/night ratio is the sum of reads assigned to that category in day divided by the sum at night. A ratio greater than one indicates higher expression in day, while a ratio less than one indicates higher expression at night.
| Functional category | Number of genes | % Genes more expressed in day | % Genes more expressed at night | % Cyclic genes (12 hr phase) | p-value from t-test of day vs. night read totals | Day/night ratio |
|---|---|---|---|---|---|---|
| Photosynthesis | 324 | 52.78 | 18.52 | 4.01 | 0.01 | 7 |
| RuBisCO | 59 | 32.2 | 3.39 | 1.69 | 0.01 | 7 |
| Glycoside hydrolases | 13 | 23.08 | 15.38 | 0 | 0.66 | 0.94 |
| Polyamines | 19 | 5.26 | 10.53 | 0 | 0.05 | 0.76 |
| Reactive oxygen species | 58 | 29.31 | 10.34 | 0 | 0.21 | 1.22 |
| Protease | 231 | 18.61 | 12.12 | 0 | 0.05 | 1.56 |
| Ribose transport | 43 | 0 | 46.51 | 0 | 0 | 0.42 |
| General sugar transport | 63 | 6.35 | 49.21 | 3.17 | 0 | 0.55 |
| Xylose transport | 15 | 0 | 33.33 | 0 | 0.02 | 0.58 |
| Methane/ammonia monooxygenase | 24 | 12.5 | 33.33 | 0 | 0.06 | 0.67 |
| Amino acid transport | 101 | 2.97 | 24.75 | 0 | 0.02 | 0.67 |
Table 4. Gene expression in day vs. night by functional categories in Sparkling Lake. We aggregated timepoints by day (9AM, 1PM, 5PM) and night (9PM, 1AM, 5AM) to compare differential gene expression. This analysis includes the top 20,000 most expressed genes. Functional categories were determined based on gene annotations. Genes with cyclic trends were detected using RAIN, while p-values of day vs. night read totals per sample were calculated using a two-tailed t-test. The day/night ratio is the sum of reads assigned to that category in day divided by the sum at night. A ratio greater than one indicates higher expression in day, while a ratio less than one indicates higher expression at night.
| Functional category | Number of genes | % Genes more expressed in day | % Genes more expressed at night | % Cyclic genes (12 hr phase) | p-value from t-test of day vs. night read totals | Day/night ratio |
|---|---|---|---|---|---|---|
| Photosynthesis | 573 | 30.89 | 10.47 | 16.23 | 0 | 2.76 |
| Rhodopsins | 95 | 6.32 | 1.05 | 2.11 | 0.77 | 0.95 |
| RuBisCO | 97 | 0 | 1.03 | 0 | 0.08 | 0.66 |
| Polyamines | 23 | 4.35 | 13.04 | 0 | 0.18 | 0.8 |
| Alkaline phosphatase | 12 | 0 | 0 | 0 | 0.13 | 0.62 |
| Reactive oxygen species | 68 | 29.41 | 1.47 | 14.71 | 0 | 1.87 |
| Protease | 278 | 10.79 | 0 | 2.16 | 0.14 | 1.46 |
| Carboxylate transport | 27 | 7.41 | 0 | 3.7 | 0.15 | 1.26 |
| Ribose transport | 13 | 0 | 7.69 | 0 | 0.07 | 0.69 |
| General sugar transport | 102 | 6.86 | 3.92 | 1.96 | 0.48 | 0.88 |
| Raffinose/stachyose/melibiose transport | 11 | 0 | 0 | 0 | 0.02 | 0.53 |
| Glucose/mannose transport | 15 | 0 | 0 | 0 | 0.14 | 0.77 |
| Xylose transport | 17 | 0 | 0 | 0 | 0.28 | 0.76 |
| Fructose transport | 13 | 0 | 0 | 0 | 0.09 | 0.69 |
| Amino acid transport | 158 | 3.16 | 0 | 0.63 | 0.31 | 1.18 |
Figure 2. Taxonomic assignments of functional categories with significant differential expression in day vs. night in Lake Mendota. Categories found to have significant differential expression (see Table 2) were further assessed for differences in the taxonomic assignments of genes expressed in day vs. night. With the exception of general sugar transport and rhodopsins, taxonomy profiles were remarkably similar in day vs. night, indicating that microbes expressing these functions act in concert rather than partitioning by time. The best available phylum-level taxonomic assignment for each gene was used in this analysis; this was calculated for MAGs when available, at the contig level when contigs were unbinned, and at the gene level when contigs were too short to classify. Proteobacteria were split into classes due to the high diversity of this phylum.
Figure 3. Taxonomic assignments of functional categories with significant differential expression in day vs. night in Trout Bog. Categories found to have significant differential expression (see Table 3) were further assessed for differences in the taxonomic assignments of genes expressed in day vs. night. With the exception of photosynthesis and RuBisCO, taxonomy profiles were remarkably similar in day vs. night, indicating that microbes expressing these functions act in concert rather than partitioning by time. The best available phylum-level taxonomic assignment for each gene was used in this analysis; this was calculated for MAGs when available, at the contig level when contigs were unbinned, and at the gene level when contigs were too short to classify. Proteobacteria were split into classes due to the high diversity of this phylum.
Figure 4. Taxonomic assignments of functional categories with significant differential expression in day vs. night in Sparkling Lake. Categories found to have significant differential expression (see Table 4) were further assessed for differences in the taxonomic assignments of genes expressed in day vs. night. Taxonomy profiles were remarkably similar in day vs. night, indicating that microbes expressing these functions act in concert rather than partitioning by time. The best available phylum-level taxonomic assignment for each gene was used in this analysis; this was calculated for MAGs when available, at the contig level when contigs were unbinned, and at the gene level when contigs were too short to classify. Proteobacteria were split into classes due to the high diversity of this phylum.
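The "best available" taxonomic assignment used in Figures 2-4 follows a fallback order: MAG first, then contig, then gene. A minimal sketch of that logic is given below. The record layout, field names, and the "Unclassified" label are hypothetical and chosen only for illustration.

```python
def best_assignment(gene_record):
    """Return the best available taxonomic label for one gene.

    gene_record: dict with optional "mag", "contig", and "gene" keys,
    each mapping to a dict with "phylum" and (optionally) "class".
    This layout is an assumption made for the example.
    """
    # Prefer the MAG-level call, then the contig, then the gene itself.
    for level in ("mag", "contig", "gene"):
        tax = gene_record.get(level)
        if tax and tax.get("phylum"):
            # Proteobacteria are reported at class level when possible.
            if tax["phylum"] == "Proteobacteria" and tax.get("class"):
                return tax["class"]
            return tax["phylum"]
    return "Unclassified"


# Hypothetical records: one binned gene, one classified only at gene level.
binned = {"mag": {"phylum": "Cyanobacteria"}, "gene": {"phylum": "Bacteroidetes"}}
gene_only = {"mag": None, "contig": None,
             "gene": {"phylum": "Proteobacteria", "class": "Betaproteobacteria"}}
labels = [best_assignment(binned), best_assignment(gene_only)]
```

Ordering the levels from most to least genomic context means that a gene-level call is used only when no MAG or contig classification exists.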
Supp Figure SX. Top 10 most expressed genes in each study site. Coding regions from reference genomes and metagenome assemblies were clustered at 97% sequence similarity, and the longest coding region was chosen as the representative sequence. Metatranscriptomic reads were mapped to these representative sequences. Annotations and classifications are derived from the sequence to which each read mapped. Read counts were summed across each lake and log transformed for visualization.
Supp Figure SX. Top 10 most expressed annotated heterotrophic genes in each study site. Coding regions from reference genomes and metagenome assemblies were clustered at 97% sequence similarity, and the longest coding region was chosen as the representative sequence. Metatranscriptomic reads were mapped to these representative sequences. The top 10 most expressed genes from each lake, filtered to exclude photosynthetic genes, phototrophic organisms, hypothetical genes, and unclassified genes, are presented here. Annotations and classifications are derived from the sequence to which each read mapped. Read counts were summed across each lake and log transformed for visualization.
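Choosing the longest coding region as the representative of each cluster, as described in the two captions above, can be sketched as below. The clustering itself (at 97% sequence similarity) is assumed to have been done already by a dedicated tool; the data layout and identifiers are hypothetical.

```python
def pick_representatives(clusters):
    """Choose the longest sequence in each cluster as its representative.

    clusters: {cluster_id: {sequence_id: nucleotide_sequence}}
    Returns {cluster_id: sequence_id}. Ties keep the first sequence seen.
    """
    return {
        cluster_id: max(members, key=lambda seq_id: len(members[seq_id]))
        for cluster_id, members in clusters.items()
    }


# Hypothetical clusters produced by an upstream clustering step.
clusters = {
    "cluster_1": {"geneA": "ATGAAATAG", "geneB": "ATGAAACCCTAG"},
    "cluster_2": {"geneC": "ATGTAG"},
}
representatives = pick_representatives(clusters)
```

Reads would then be mapped only to these representatives, so each cluster contributes a single nonredundant reference sequence.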
Figure SX. Abundance vs. expression by phylum and lake. To determine which phyla were most abundant or most expressed during our time series, we analyzed metagenomic and metatranscriptomic read counts. The expression of clustered, nonredundant genes was aggregated by phylum and compared to the coverage of those phyla in metagenomes. Genes that could not be classified to a phylum were not included in this analysis. No positive relationship was observed between expression and abundance. One phylum, Chloroflexi, was removed from the plot of Lake Mendota because its expression and abundance were orders of magnitude higher than those of other phyla; it is likely an outlier.
Figure SX. Assessing the variability of metatranscriptomic read counts. One aim of this metatranscriptomic study was to provide information on the variability in gene expression in freshwater that could be used to guide further metatranscriptomic experiments. We calculated the coefficient of variation (CoV) for each gene both within replicates and across samples (A). CoV was lower within replicates than across samples, indicating that variation across samples is not purely technical. High levels of variability were observed in genes from all three lakes. In panel B, three example eigenvectors of genes are shown, including trends for 1) on one day, off the next, 2) up and down across the time series, and 3) observed in only one timepoint.
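The within-replicate versus across-sample comparison in panel A can be illustrated with a short sketch of the coefficient of variation (standard deviation divided by mean). The timepoint labels and read counts are hypothetical, and averaging the within-replicate values per gene is an assumption about how the two quantities were summarized.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mu = mean(values)
    return stdev(values) / mu if mu else float("nan")


# Hypothetical read counts for one gene: replicates at each timepoint.
counts = {"9AM": [100, 110], "1PM": [200, 190], "5PM": [50, 60]}

# Within replicates: CoV per timepoint, averaged (assumed summary).
within = mean([coefficient_of_variation(reps) for reps in counts.values()])

# Across samples: CoV of the per-timepoint means.
across = coefficient_of_variation([mean(reps) for reps in counts.values()])
```

In this toy example `within` is much smaller than `across`, which is the pattern expected when differences between timepoints exceed replicate noise.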
Figure SX. Photosynthetically active radiation by time and lake. Measurements of photosynthetically active radiation collected just below the surface of the water prior to each RNA collection event guided our decision to categorize samples collected at 9:00, 13:00, and 17:00 as “day” and samples collected at 21:00, 1:00, and 5:00 as “night”. Sunrise and sunset times for each lake are reported in Table 1.